Adaptive cross-contextual word embedding for word polysemy with unsupervised topic modeling

Abstract

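Because of its efficiency, word embedding has been widely used in many natural language processing and text modeling tasks. It aims to represent each word by a vector such that the geometry between these vectors can capture the semantic correlations between words. An ambiguous word can often have diverse meanings in different contexts, a quality which is called polysemy. The bulk of studies aimed to generate only one single embedding for each word, whereas a few studies made a small number of embeddings to present the different meanings of each word. However, it is hard to determine the exact number of senses for each word, as the meanings depend on contexts. To address this problem, this paper proposes a novel adaptive cross-contextual word embedding (ACWE) method for capturing word polysemy in different contexts based on topic modeling, in which the word embeddings are defined over a latent interpretable semantic space. The proposed ACWE consists of two main parts. In the first part, an unsupervised word embedding with a probabilistic topic model is designed to obtain the global word embeddings, in which each word is represented in a unified latent semantic space. Based on the global word embeddings, an adaptive cross-contextual word embedding process is then devised in the second part to learn the local word embeddings for each polysemous word in different contexts. In fact, a word embedding is adaptively adjusted and updated with respect to different contexts to obtain the tailored word embedding for the corresponding contexts. The proposed ACWE is validated on datasets collected from Wikipedia and IMDb on different tasks, including word similarity, polysemy induction, semantic interpretability, and text classification. Experimental results indicate that ACWE not only outperforms the established word embedding methods that consider word polysemy on six popular benchmark datasets, but also yields competitive performance compared with state-of-the-art deep learning-based approaches that do not consider polysemy. Moreover, it significantly improves the performance of text classification in both precision and F1, and the visualizations of the semantics of words demonstrate the feasibility and advantage of the proposed ACWE model on polysemy.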
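The idea of adjusting a global embedding to a context can be illustrated with a toy sketch. This is not the ACWE algorithm, whose update rule is not given in the abstract; it assumes a hypothetical three-topic space, hand-picked vectors, and a simple reweight-and-renormalize rule chosen only for illustration. The function names `cosine` and `adapt_to_context` are likewise assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def adapt_to_context(global_vec, context_topics):
    """Reweight a topic-space embedding by a context's topic mixture.

    Illustrative rule only (an assumption, not the paper's method):
    elementwise product followed by renormalization to sum to 1.
    """
    weighted = [g * c for g, c in zip(global_vec, context_topics)]
    total = sum(weighted)
    if total == 0:
        return list(global_vec)  # context shares no topic with the word
    return [w / total for w in weighted]


# Hypothetical topic axes: [finance, nature, sports]
bank_global = [0.5, 0.5, 0.0]      # ambiguous word: two senses mixed
money_global = [0.9, 0.05, 0.05]
river_global = [0.05, 0.9, 0.05]

finance_context = [0.8, 0.1, 0.1]  # topic mixture of a finance document
nature_context = [0.1, 0.8, 0.1]   # topic mixture of a nature document

bank_in_finance = adapt_to_context(bank_global, finance_context)
bank_in_nature = adapt_to_context(bank_global, nature_context)

print(cosine(bank_in_finance, money_global), cosine(bank_in_finance, river_global))
print(cosine(bank_in_nature, river_global), cosine(bank_in_nature, money_global))
```

In this toy setup, the single global vector for "bank" sits between the two senses, while each context-adapted vector moves toward the neighbor that matches its context.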


Related articles

Contextual Modeling for Meeting Translation Using Unsupervised Word Sense Disambiguation

In this paper we investigate the challenges of applying statistical machine translation to meeting conversations, with a particular view towards analyzing the importance of modeling contextual factors such as the larger discourse context and topic/domain information on translation performance. We describe the collection of a small corpus of parallel meeting data, the development of a statistica...

A Word-Embedding-based Sense Index for Regular Polysemy Representation

We present a method for the detection and representation of polysemous nouns, a phenomenon that has received little attention in NLP. The method is based on the exploitation of the semantic information preserved in Word Embeddings. We first prove that polysemous nouns instantiating a particular sense alternation form a separate class when clustering nouns in a lexicon. Such a class, however, do...

Contextual Dependencies in Unsupervised Word Segmentation

Developing better methods for segmenting continuous text into words is important for improving the processing of Asian languages, and may shed light on how humans learn to segment speech. We propose two new Bayesian word segmentation methods that assume unigram and bigram models of word dependencies respectively. The bigram model greatly outperforms the unigram model (and previous probabilistic...

Nonparametric Spherical Topic Modeling with Word Embeddings

Traditional topic models do not account for semantic regularities in language. Recent distributional representations of words exhibit semantic consistency over directional metrics such as cosine similarity. However, neither categorical nor Gaussian observational distributions used in existing topic models are appropriate to leverage such correlations. In this paper, we propose to use the von Mi...

Topic Modeling for Word Sense Induction

In this paper, we present a novel approach to Word Sense Induction which is based on topic modeling. Key to our methodology is the use of word-topic distributions as a means to estimate sense distributions. We provide these distributions as input to a clustering algorithm in order to automatically distinguish between the senses of semantically ambiguous words. The results of our evaluation expe...

Journal

Journal title: Knowledge-Based Systems

Year: 2021

ISSN: 1872-7409, 0950-7051

DOI: https://doi.org/10.1016/j.knosys.2021.106827